Constructing Decision Trees for Graph-Structured Data by Chunkingless Graph-Based Induction
Abstract
A decision tree is an effective means of data classification from which one can obtain rules that are easy to understand. However, decision trees cannot be constructed in the conventional way for data that are not explicitly expressed as attribute-value pairs, such as graph-structured data. We have proposed a novel algorithm, named Chunkingless Graph-Based Induction (Cl-GBI), for extracting typical patterns from graph-structured data. Cl-GBI is an improved version of Graph-Based Induction (GBI), which employs stepwise pair expansion (pairwise chunking) to extract typical patterns from graph-structured data; unlike GBI, Cl-GBI can find overlapping patterns. In this paper, we further propose an algorithm for constructing decision trees for graph-structured data using Cl-GBI. This decision tree construction algorithm, called Decision Tree Chunkingless Graph-Based Induction (DT-ClGBI), can construct a decision tree from a graph-structured dataset while simultaneously constructing attributes useful for classification, using Cl-GBI internally. Since patterns (subgraphs) extracted by Cl-GBI are treated as attributes of a graph, and their presence or absence is used as the attribute value in DT-ClGBI, DT-ClGBI can be regarded as a tree generator equipped with feature construction capability. Experiments were conducted on both synthetic and real-world graph-structured datasets, showing the usefulness and effectiveness of the algorithm.
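The idea of using pattern presence or absence as a binary attribute can be illustrated with a minimal Python sketch. It is not the authors' DT-ClGBI implementation: it assumes the patterns contained in each graph have already been extracted (in DT-ClGBI this is done by Cl-GBI during tree construction), it assumes information gain as the split criterion, and the names `build_tree`, `classify`, and the toy pattern labels are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(data, pattern):
    """Gain from splitting on whether each graph contains `pattern`."""
    labels = [label for _, label in data]
    present = [label for pats, label in data if pattern in pats]
    absent = [label for pats, label in data if pattern not in pats]
    if not present or not absent:
        return 0.0  # the pattern does not separate the graphs at all
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in (present, absent))
    return entropy(labels) - remainder


def build_tree(data, max_depth=3):
    """data: list of (set of pattern ids found in the graph, class label)."""
    labels = [label for _, label in data]
    majority = Counter(labels).most_common(1)[0][0]
    if max_depth == 0 or len(set(labels)) == 1:
        return majority
    # Assumption: candidate patterns are precomputed, not mined per node.
    candidates = sorted(set().union(*(pats for pats, _ in data)))
    if not candidates:
        return majority
    best = max(candidates, key=lambda p: information_gain(data, p))
    if information_gain(data, best) <= 0:
        return majority
    present = [(pats, label) for pats, label in data if best in pats]
    absent = [(pats, label) for pats, label in data if best not in pats]
    return {"pattern": best,
            "present": build_tree(present, max_depth - 1),
            "absent": build_tree(absent, max_depth - 1)}


def classify(tree, patterns):
    """Follow presence/absence tests down to a leaf label."""
    while isinstance(tree, dict):
        branch = "present" if tree["pattern"] in patterns else "absent"
        tree = tree[branch]
    return tree


# Hypothetical toy dataset: each graph is summarized by the patterns it contains.
data = [
    (frozenset({"C-O", "C=C"}), "active"),
    (frozenset({"C-O"}), "active"),
    (frozenset({"C=C"}), "inactive"),
    (frozenset({"N-H"}), "inactive"),
]
tree = build_tree(data)
print(tree)
print(classify(tree, {"C-O", "N-H"}))
```

In this toy dataset the pattern `C-O` separates the two classes perfectly, so it becomes the root test. A real implementation would replace the precomputed pattern sets with subgraph extraction and subgraph matching at each tree node.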
Related resources
Pruning Strategies Based on the Upper Bound of Information Gain for Discriminative Subgraph Mining
Given a set of graphs with class labels, discriminative subgraphs appearing therein are useful to construct a classification model. A graph mining technique called Chunkingless Graph-Based Induction (Cl-GBI) can find such discriminative subgraphs from graph structured data. But, it sometimes happens that Cl-GBI cannot extract subgraphs that are good enough to characterize the given data due to ...
Constructing a Decision Tree for Graph-Structured Data and its Applications
A machine learning technique called Graph-Based Induction (GBI) efficiently extracts typical patterns from graph-structured data by stepwise pair expansion (pairwise chunking). It is very efficient because of its greedy search. Meanwhile, a decision tree is an effective means of data classification from which rules that are easy to understand can be obtained. However, a decision tree could not ...
Mining Discriminative Patterns from Graph Structured Data with Constrained Search
A graph mining method, Chunkingless Graph-Based Induction (Cl-GBI), finds typical patterns that appear in graph structured data by an operation called chunkingless pairwise expansion, which generates pseudo-nodes from selected pairs of nodes in the data. Cl-GBI can extract overlapping subgraphs, but at the cost of greater time and space complexity. Thus, it happens that Cl-GBI cannot extra...
Constructing a Decision Tree for Graph Structured Data
Decision tree Graph-Based Induction (DT-GBI) is proposed that constructs a decision tree for graph structured data. Substructures (patterns) are extracted at each node of a decision tree by stepwise pair expansion (pairwise chunking) in GBI to be used as attributes for testing. Since attributes (features) are constructed while a classifier is being constructed, DT-GBI can be conceived as a meth...
Constructing Graceful Graphs with Caterpillars
A graceful labeling of a graph G of size n is an injective assignment of integers from {0, 1,..., n} to the vertices of G, such that when each edge of G is assigned a weight, given by the absolute difference of the labels of its end vertices, the set of weights is {1, 2,..., n}. If a graceful labeling f of a bipartite graph G assigns the smaller labels to one of the two stable sets of G, then f ...